ABSTRACT


To date, an effective therapeutic treatment that confers
strong attenuation toward coronaviruses (CoVs) remains
elusive. Of all the potential drug targets, the helicase of
CoVs is considered to be one of the most important. Here,
we present for the first time the structure of the full-length
Nsp13 helicase of SARS-CoV (SARS-Nsp13) and investigate the
structural coordination of its five domains and how these
contribute to its translocation and unwinding activity. A
translocation model is proposed for the Upf1-like helicase
members according to three different structural conditions
in solution characterized through H/D exchange assay,
including substrate state (SARS-Nsp13-dsDNA bound with
AMPPNP), transition state (bound with ADP-AlF4−) and
product state (bound with ADP). We observed that the
β19–β20 loop on the 1A domain is directly involved in the
unwinding process. Furthermore, we have shown that the
RNA-dependent RNA polymerase (RdRp), SARS-Nsp12, can
enhance the helicase activity of SARS-Nsp13 through
interacting with it directly. The interacting regions were
identified and can be considered common across CoVs, which
provides new insights into the Replication and
Transcription Complex (RTC) of CoVs.


INTRODUCTION 


The emergence of Severe Acute Respiratory Syndrome
coronavirus (SARS-CoV) in 2003 provided the first
opportunity to investigate a coronavirus (CoV) that was a
severe human pathogen. A decade later, a similar
coronavirus termed Middle East Respiratory Syndrome
Coronavirus (MERS-CoV) emerged; alarmingly, this virus has
a higher case-fatality rate than SARS-CoV. The world's
attention has therefore refocused on CoVs. The fact that no
therapeutic treatments are available for CoVs is a serious
concern (1,2). It is therefore necessary to study the life
cycle of CoVs to develop new ideas for effective vaccines
or drugs.

SARS-CoV, which belongs to the genus Betacoronavirus in the
family Coronaviridae, has one of the largest known genomes
(~29.7 kb) among RNA viruses. Two large polyproteins, pp1a
and pp1ab, are encoded by this genome. After proteolytic
processing, 16 non-structural proteins (Nsps) are produced,
including primase (Nsp8), RNA-dependent RNA polymerase
(Nsp12) and helicase (Nsp13). These three enzymes and other
Nsps are components of a replication and transcription
complex (RTC) which is essential for the life cycle of
SARS-CoV (3,4).

The helicase SARS-CoV Nsp13 (SARS-Nsp13) plays a vital role
in catalyzing the unwinding of duplex oligonucleotides into
single strands in an NTP-dependent manner. Importantly,
SARS-Nsp13 has been identified as an ideal target for the
development of anti-viral drugs due to its sequence
conservation and indispensability across all CoV species
(5–7). SARS-Nsp13 has been characterized as belonging to
superfamily 1 (SF1) of the six helicase superfamilies,
which are classified on the basis of several conserved
motifs, and it can unwind both RNA and DNA duplexes in the
5′ to 3′ direction (8). The associated NTPase activity can
use all natural nucleotides and deoxynucleotides as
substrates (9,10). Moreover, it has been shown that
SARS-Nsp12 can enhance the helicase activity of SARS-Nsp13
by increasing the step size of nucleic acid (dsRNA or
dsDNA) unwinding by 2-fold (11). However, how SARS-Nsp12
increases the helicase activity of SARS-Nsp13, and whether
the NTPase activity is also influenced, remain unclear.

Structures of helicases from SF1 are available, amongst
which Upf1, a eukaryotic RNA helicase essential for the
nonsense-mediated mRNA decay (NMD) pathway, and Nsp10, the
helicase of equine arteritis virus (EAV), share many
structural features (12,13). SARS-CoV Nsp13 is also a
Upf1-like helicase. However, until the structure of
MERS-CoV Nsp13 was recently solved, no structural
information for the coronavirus helicase was available,
despite biochemical characterization and the determination
of kinetic parameters associated with its helicase or
NTPase activity (14). The structure of the MERS-CoV
helicase in the absence of nucleotide and substrate was
reported to have four domains: an N-terminal CH domain, two
helicase core domains (RecA1 and RecA2) and an inserted
domain 1B. In addition, there is a 'stalk' region which
connects the CH domain and the 1B domain. However, how the
five domains cooperate to contribute to the helicase
function remains undefined.

Here, we present for the first time the structure of the
full-length SARS-CoV Nsp13 (SARS-Nsp13). The five domains,
namely the zinc-binding domain, stalk domain, 1B domain, 1A
domain and 2A domain, are shown to coordinate with each
other to complete the final unwinding process. Helicases
have been characterized as translocases, as the unwinding
activity can be the result of the helicase translocating on
single-stranded oligonucleotides (15). We demonstrate how
the 1A and 2A domains coordinate with each other when
SARS-Nsp13 translocates on ssDNA by observing the H/D
exchange behavior of three states of SARS-Nsp13 bound to
different ligands: an ATP analog (AMPPNP), ADP-AlF4− and
ADP. Moreover, we show that SARS-Nsp13 can interact with
SARS-Nsp12 with high affinity and identify the key
interaction domain on SARS-Nsp13, which provides insight
into the RTC of SARS-CoV.


MATERIALS AND METHODS 
Protein expression and purification 


The gene encoding the full-length helicase SARS-Nsp13
(1–601 aa), from the genome of SARS-CoV strain Frankfurt 1
(GenBank accession no. AY291315), was inserted into a
modified pET-28a vector at the NcoI/XhoI restriction sites
with a hexa-histidine tag attached at its N-terminal end.
BL21(DE3) cells were then transformed with this plasmid.
After the culture was scaled up, the target gene was
overexpressed. Cells were grown at 37°C and induced with
200 µM IPTG when the OD value reached ~0.8. Thereafter, the
induced cells were transferred to 18°C and grown for
12–16 h. Cells were harvested by centrifugation at 4500 rpm
at 4°C. After sonication and centrifugation at 14 000 rpm,
the supernatant was run through a Ni-affinity column and
the protein eluted with 200 mM imidazole. The eluate was
then further purified by ion exchange (HiTrap S column) and
size-exclusion chromatography (Superdex 200, GE
Healthcare).


Crystallization, data collection and structure determination 


The protein solution was collected and concentrated to 6.7
mg/ml, incubated with a 25-nt poly-thymine single-stranded
DNA (dT25) at a molar ratio of 1:1.2, and then incubated
with 2 mM AMPPNP at 4°C for 3 h. The hanging-drop
vapor-diffusion method was used to grow the Nsp13 crystals.
The conditions for optimal crystal growth were 12% (w/v)
polyethylene glycol 20 000, 2 M ammonium sulfate and 0.1 M
MES monohydrate pH 6.5 at 16°C. The protein and this
crystallization buffer were mixed in equal volumes.

All diffraction data sets were collected on beamline BL19U
at the Shanghai Synchrotron Radiation Facility (SSRF). Data
were indexed, integrated and scaled with XDS (16).
Single-wavelength anomalous data were collected at the zinc
absorption edge and SHELXD was used to locate the six zinc
atoms (17). The density map was improved with the solvent
flattening module of the PHENIX program (18). The initial
model was manually built in COOT (19) and further refined
in PHENIX. The final 153 residues (443–596) were fitted
using the molecular replacement module in PHENIX with the
equivalent residues of MERS-Nsp13. The structure was
refined to 2.8 Å resolution.

Data collection and processing statistics are summarized 
in Table 1. 
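
As a quick consistency check on the cell parameters in Table 1, the unit-cell volume and a Matthews coefficient can be estimated as sketched below. This is only an illustration and not part of the reported analysis: the molecular weight of about 67 kDa is a rough assumed figure for a 601-residue protein, the count of two molecules per asymmetric unit is taken from the Results, and the four asymmetric units per cell follow from space group C2.

```python
import math

def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume (cubic angstroms) of a monoclinic cell, where alpha = gamma = 90 degrees."""
    return a * b * c * math.sin(math.radians(beta_deg))

def matthews_coefficient(cell_volume, molecules_per_cell, molecular_weight_da):
    """Matthews coefficient V_M in cubic angstroms per dalton."""
    return cell_volume / (molecules_per_cell * molecular_weight_da)

# Cell parameters as listed in Table 1 (a, b, c in angstroms; beta in degrees).
cell_volume = monoclinic_cell_volume(192.0, 189.2, 57.3, 102.9)

# ASSUMPTIONS: 4 asymmetric units per C2 cell, 2 molecules per asymmetric unit
# (stated in the Results), and an approximate molecular weight of 67 kDa.
ASSUMED_MW_DA = 67_000.0
molecules_per_cell = 4 * 2
v_m = matthews_coefficient(cell_volume, molecules_per_cell, ASSUMED_MW_DA)

print(f"cell volume ~ {cell_volume:.3e} cubic angstroms")
print(f"Matthews coefficient ~ {v_m:.2f} cubic angstroms per Da")
```

Under these assumptions the cell would be fairly solvent-rich; a different molecular weight or molecule count changes the estimate proportionally.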


Surface plasmon resonance (SPR) assay 


100 µl of 20 µg/ml SARS-Nsp12 in sodium acetate buffer pH
4.5 was prepared for amine coupling onto channel 2 of a CM5
chip and fixed through addition of 100 µl ETA in water. A
concentration gradient of SARS-Nsp13 from 0.39 µM to
3.12 µM was set up for four cycles of binding
measurements. 5 mM NaOH was used for regeneration of the
chip. The contact time and dissociation time were each set
to 60 s. The experimental data and fitting data were
processed with GraphPad Prism.


Nucleic acid unwinding assay 


For the helicase activity assay, a dsDNA substrate
(5′-AATGTCTGACGTAAAGCCTCTAAAATGTCTG-3′-BHQ,
CY3-5′-CAGACATTTTAGAGG-3′) was used, with the excitation
wavelength set to 547 nm and the emission wavelength set to
562 nm to detect the fluorescence of CY3. 200 nM Nsp13
(final concentration) was added to the reaction buffer (50
mM HEPES pH 7.5, 20 mM NaCl, 4 mM MgCl2, 1 mM DTT, 0.1
mg/ml BSA) and incubated with dsDNA and 20 µM trap ssDNA
for 5 min. Then 2 mM ATP (final concentration) was added to
initiate unwinding and the fluorescence value was recorded
with a Perkin-Elmer Envision.
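
The design of this substrate can be checked with a few lines of code. The sketch below is an illustration only: it takes the two sequences exactly as printed above, finds where the short CY3 strand anneals on the long BHQ strand, and reports the single-stranded overhangs. The function and variable names are introduced here and do not come from the original work.

```python
# Watson-Crick complement table for DNA.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

# Sequences as given in the text, both written 5'->3'.
long_strand = "AATGTCTGACGTAAAGCCTCTAAAATGTCTG"  # BHQ on the 3' end
short_strand = "CAGACATTTTAGAGG"                  # CY3 on the 5' end

# The short strand pairs with the stretch of the long strand that
# matches its reverse complement.
binding_site = reverse_complement(short_strand)
start = long_strand.find(binding_site)

overhang_5p = start                                          # unpaired nt at the 5' end
overhang_3p = len(long_strand) - (start + len(binding_site))  # unpaired nt at the 3' end

print(f"duplex length: {len(binding_site)} bp")
print(f"5' overhang: {overhang_5p} nt, 3' overhang: {overhang_3p} nt")
```

The short strand pairs with the 3′ end of the long strand, which places CY3 next to the BHQ quencher in the duplex and leaves a single-stranded 5′ tail on the long strand, as would be expected for a helicase that unwinds in the 5′ to 3′ direction.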


ATPase assay 


The ATPase assay was performed using the two enzyme 
coupling method as follows. The two enzymes, pyruvate ki- 
nase and L-lactate dehydrogenase (from Sigma) were added 
Table 1. Data collection and refinement statistics

Parameters                                SARS-Nsp13

Data collection statistics
  Cell parameters
    a (Å)                                 192.0
    b (Å)                                 189.2
    c (Å)                                 57.3
    α, β, γ (°)                           90.0, 102.9, 90.0
  Space group                             C2
  Wavelength used (Å)                     0.9798
  Resolution (Å)                          50.0–2.69 (2.72–2.69)^c
  No. of all reflections                  222 454 (9905)
  No. of unique reflections               21 016 (1065)
  Completeness (%)                        100.0 (100.0)
  Average I/σ(I)                          10.6 (2.33)
  Rmerge^a (%)                            10.4 (69.9)

Refinement statistics
  No. of reflections used (σ(F) > 0)      48 632
  Rwork^b (%)                             23.75
  Rfree^b (%)                             29.25
  r.m.s.d. bond distance (Å)              0.014
  r.m.s.d. bond angle (°)                 1.811
  Average B-value (Å^2)                   61.49
  No. of protein atoms                    9335
  No. of ligand atoms                     0
  No. of solvent atoms                    149

Ramachandran plot
  res. in favored regions (%)             79.52
  res. in allowed regions (%)             15.51
  res. in outlier regions (%)             4.97

^a $R_{merge} = \sum_h \sum_i |I_{ih} - \langle I_h \rangle| / \sum_h \sum_i \langle I_h \rangle$, where $\langle I_h \rangle$ is the mean of the observations $I_{ih}$ of reflection $h$.
^b $R_{work} = \sum (||F_p(obs)| - |F_p(calc)||) / \sum |F_p(obs)|$. Rfree is an R factor for a pre-selected subset (5%) of reflections that was excluded in the refinement.
^c Numbers in parentheses are corresponding values for the highest resolution shell.


to the 200 µl reaction buffer system at final
concentrations of 100 units/ml and 200 units/ml,
respectively. Phosphoenolpyruvate (PEP, from Sigma) and
NADH were also added as co-factors for enzyme coupling.
20 nM Nsp13 was incubated with ssDNA in a buffer containing
50 mM MOPS pH 7.0, 10 mM MgCl2 and 50 mM NaCl for 5 min.
Varying concentrations of ATP were then added to initiate
the reaction and the OD340 nm value of NADH was measured
using the Perkin-Elmer Envision. The Km value was
calculated using GraphPad Prism.
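
To make the Km calculation concrete, a Michaelis-Menten fit can be sketched as follows. This is not the procedure used in the study, which relied on GraphPad Prism; the sketch substitutes a simple Hanes-Woolf linearization so that it needs only the standard library. The ATP concentrations and the Vmax and Km values are those quoted in the legend of Figure 1, but the velocities are synthetic, noise-free values generated from those parameters purely to show that the fit recovers them. They are not measured data.

```python
def michaelis_menten(s, vmax, km):
    """Initial velocity predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)

def fit_hanes_woolf(substrate, velocity):
    """Estimate (Vmax, Km) from the Hanes-Woolf line S/v = S/Vmax + Km/Vmax."""
    x = list(substrate)
    y = [s / v for s, v in zip(substrate, velocity)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                    # equals 1 / Vmax
    intercept = mean_y - slope * mean_x  # equals Km / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km

# ATP concentrations (mM) listed in the Figure 1 legend.
atp_mM = [0.08, 0.1, 0.25, 0.5, 1.0, 1.25, 2.0]

# SYNTHETIC velocities generated from the reported parameters
# (Vmax = 0.4845 uM/min, Km = 0.1552 mM); illustration only.
synthetic_v0 = [michaelis_menten(s, 0.4845, 0.1552) for s in atp_mM]

vmax_est, km_est = fit_hanes_woolf(atp_mM, synthetic_v0)
print(f"Vmax ~ {vmax_est:.4f} uM/min, Km ~ {km_est:.4f} mM")
```

With real, noisy measurements a nonlinear least-squares fit is generally preferable to a linearization, because the transformation distorts the error structure of the data.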


Hydrogen/deuterium exchange mass spectrometry (H/DX 
MS) 


H/DX MS is an established method for detecting
protein-protein and protein-DNA interactions at the peptide
level (20). The exchange of amide backbone hydrogens for
deuterons is monitored by mass spectrometry and, upon
proteolytic digestion, can be localized to specific
peptides within the primary structure. It can report on
interaction-related regions, in which backbone amide groups
are involved in interactions, or on the sequestration of
regions of a protein in a solvent-inaccessible hydrophobic
core, as deuteration requires both exposure of the region
to solvent and fluctuations in hydrogen bonding (21).

For the H/DX experiment, 5 µl of 0.7 µg/µl SARS-Nsp13 alone
or in the presence of ligand was prepared. The buffer, at
pH 7.0, contained 20 mM HEPES, 150 mM NaCl and 4 mM MgCl2.
To initiate deuterium labeling, 5 µl of each 2 µg/µl
protein solution was diluted with 45 µl of labeling buffer
(99% D2O, pH 6.6) at 25°C for 10 min, and 50 µl of ice-cold
quench buffer (1% (v/v) formic acid in water, 100% H2O) was
added to quench the labeling. The reaction tube was then
put on ice. 10 µl of 1 µM pepsin solution was added for
digestion. After 5 min, the sample was centrifuged and
placed into the auto-sampler of the Ultimate 3000 UPLC
system (Thermo, CA, USA) for injection. 50 µl of sample was
then loaded onto and separated by an ACQUITY UPLC 1.7 µm
BEH C18 1.0 mm × 50 mm column (Waters). A 1–50% gradient of
acetonitrile over 37 min at a flow rate of 100 µl/min was
used to separate peptides. Both chromatographic mobile
phases contained 0.1% (v/v) formic acid. Mass spectrometry
analysis was performed on a Q Exactive Orbitrap mass
spectrometer (Thermo, CA). The hydrogen/deuterium exchange
difference of each peptide between protein alone and
protein with ligand was manually checked.
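
Two quantities implied by this protocol can be written down explicitly: the D2O fraction reached during labeling, and the deuterium uptake of a peptide derived from the shift of its isotopic envelope. The sketch below is a generic illustration and not the authors' workflow, in which the shifts were checked manually. The centroid-based uptake is a commonly used convention assumed here, and the peak lists are hypothetical placeholders.

```python
def final_d2o_fraction(sample_ul, label_ul, label_d2o_fraction):
    """D2O fraction after diluting an H2O sample into the labeling buffer."""
    return label_ul * label_d2o_fraction / (sample_ul + label_ul)

def centroid_mz(peaks):
    """Intensity-weighted mean m/z of an isotopic envelope of (mz, intensity) pairs."""
    total_intensity = sum(intensity for _, intensity in peaks)
    return sum(mz * intensity for mz, intensity in peaks) / total_intensity

def deuterium_uptake(labeled_peaks, unlabeled_peaks, charge):
    """Mass increase in Da: centroid m/z shift multiplied by the charge state."""
    return (centroid_mz(labeled_peaks) - centroid_mz(unlabeled_peaks)) * charge

# Volumes and D2O content as described in the protocol above.
d2o_fraction = final_d2o_fraction(5.0, 45.0, 0.99)

# HYPOTHETICAL envelopes for a doubly charged peptide (illustration only).
unlabeled = [(500.0, 1.0), (500.5, 1.0)]
labeled = [(501.0, 1.0), (501.5, 1.0)]
uptake_da = deuterium_uptake(labeled, unlabeled, charge=2)

print(f"D2O during labeling: {d2o_fraction:.1%}")
print(f"uptake of the example peptide: {uptake_da:.2f} Da")
```

Comparing the uptake of the same peptide with and without ligand then gives the sign of the shift: lower uptake with ligand suggests protection, higher uptake suggests increased exposure.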


Electrophoretic mobility shift assay (EMSA) 


To screen for an optimal substrate for Nsp13, we tested >20
nucleic acids, most of which originated from the sequence
of the SARS genome. We conducted an electrophoretic
mobility shift assay (EMSA) to see which nucleic acid had
the strongest binding affinity for Nsp13. Finally, we
identified one which originated from the first 38
nucleotides of Nsp7 of the SARS genome, named 7F
(CATGCCATGGCCTCTAAAATGTCTGACGTAAAGTGCACATCTGT). The dsDNA
we used in the helicase activity assay was also derived
from 7F.

EMSA was performed to detect both the nucleic acid binding
ability and the unwinding activity. The binding buffer used
to incubate SARS-Nsp13 with nucleic acids contained 20 mM
HEPES pH 7.0, 50 mM NaCl and 5 mM MgCl2, while the
unwinding buffer contained 20 mM HEPES pH 7.0, 50 mM NaCl,
5 mM MgCl2, 0.1 mM DTT and 0.1 mg/ml BSA. After 30 min of
incubation, 10× loading buffer consisting of 50% glycerol
and 200 mM HEPES was added to each sample before
electrophoresis at 110 V, where the running buffer
consisted of 25 mM Tris and 192 mM glycine (pH 8.2).


Docking the dsDNA on SARS-Nsp13 structure 


The NPDock server (22) docking program was used to predict
the interactions between the ATP–Nsp13 complex and DNA. The
double-helix DNA fragments were constructed with the
Nucleic Acid Builder (23) and used as the initial
structures for the docking. NPDock performed global
macromolecular docking with the default parameters and
generated 20 000 models. The RMSD cut-off was set to 5 Å
for clustering of the best-scored models. After refinement
of protein–DNA contacts in the models, the best-scoring
models from the three largest clusters were selected for
further analysis. The score of the dsDNA–Nsp13 complex
model was −9.59, which indicated that the model was
reliable. Finally, the best pose with the highest
probability was used to illustrate the interaction between
SARS-Nsp13 and dsDNA in PyMOL.
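
The clustering step can be pictured with the following minimal sketch. It is not NPDock's implementation, whose internals are not described here. It is a generic greedy scheme in which the best-scored unassigned model becomes a cluster centre and every unassigned model within the RMSD cut-off joins it. It assumes that lower scores are better and that all poses share the receptor's coordinate frame, so no superposition is performed. The toy models are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized lists of (x, y, z)."""
    squared = sum(
        (a - b) ** 2
        for point_a, point_b in zip(coords_a, coords_b)
        for a, b in zip(point_a, point_b)
    )
    return math.sqrt(squared / len(coords_a))

def cluster_by_rmsd(models, cutoff=5.0):
    """Greedy clustering of (score, coords) models; returns lists of model indices.

    Each cluster starts with its best-scored member; clusters are returned
    largest first.
    """
    order = sorted(range(len(models)), key=lambda i: models[i][0])
    unassigned = set(order)
    clusters = []
    for centre in order:
        if centre not in unassigned:
            continue
        members = [
            j for j in order
            if j in unassigned and rmsd(models[centre][1], models[j][1]) <= cutoff
        ]
        unassigned.difference_update(members)
        clusters.append(members)
    clusters.sort(key=len, reverse=True)
    return clusters

# HYPOTHETICAL two-atom "ligand poses" with made-up scores.
toy_models = [
    (-9.0, [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
    (-8.0, [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]),
    (-7.5, [(20.0, 0.0, 0.0), (21.0, 0.0, 0.0)]),
    (-9.5, [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]),
]
toy_clusters = cluster_by_rmsd(toy_models, cutoff=5.0)
print(toy_clusters)
```

Taking the first member of each of the largest clusters then corresponds to selecting the best-scoring model per cluster, as described above.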


RESULTS 
Overall structural description of SARS-Nsp13 


Full-length SARS-Nsp13 (residues 1-601) was expressed in 
Escherichia coli with a hexa-histidine tag at the N-terminus. 
We confirmed that the SARS-Nsp13 expressed in E. coli 
can function normally with helicase and ATPase activity 
through nucleic acid unwinding assay and ATP hydrolysis 
assay (Figure 1C and D). 

Similar to MERS-Nsp13, the overall structure of Nsp13 
assumes a triangular pyramid shape consisting of five do- 
mains. Three domains including the two ‘RecA-like’ do- 
mains named 1A and 2A and the 1B domain are arranged 
to form the triangular base, leaving the remaining two 
domains including the N-terminal Zinc binding domain 
(ZBD) and the stalk domain directed towards the apex 
of the pyramid. The ZBD and 1B domains are connected 
through the stalk domain (Figure 1A). 

There are two SARS-Nsp13 molecules in the asymmetric unit,
with the ZBD domain providing the interaction interface
(Figure 1B). However, SARS-Nsp13 remains a monomer in
solution, either alone or with dsDNA (Supplementary Figure
S1). In addition, the ZBD domains of the two MERS-Nsp13
molecules do not contact each other at all (Figure 1B),
indicating that the arrangement of the two SARS-Nsp13
molecules can be attributed to crystal packing.


The NTP hydrolysis active site 


It has been shown that SARS-Nsp13 is an NTP-dependent SF1
helicase. To better understand the NTPase active site, we
identified six key residues involved in NTP hydrolysis
through superimposition of SARS-Nsp13 with yeast
Upf1-ADP-AlF4− (2XZL) (24), which also belongs to the SF1
family. The six residues, K288, S289, D374, E375, Q404 and
R567, are highlighted in green and cluster together in the
cleft at the base between the 1A and 2A domains (Figure
2A).

Five of these residues, the exception being S289, are also
conserved in MERS-Nsp13 and were proposed to be involved in
nucleotide hydrolysis, although without confirmation by a
functional assay (14) (Supplementary Figure S2). Here, we
tested the helicase activity of six single mutants. The
results (Figure 2D–G) showed that all six mutants are
severely deficient in unwinding. As expected, the ATPase
activity of the six mutants was also markedly decreased;
among them, S289A possessed the highest ATPase activity,
which was still much lower than that of wild-type
SARS-Nsp13 (WT-Nsp13) (Figure 2B). The initial ATP
hydrolysis velocities of the six mutants are almost the
same and lower than 50% of that of WT-Nsp13 (Figure 2C).
The helicase activity of all six mutants is consistent with
their ATPase activity, indicating that the unwinding
activity of SARS-Nsp13 is dependent on ATP hydrolysis.
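
The comparison with wild type can be reproduced from the initial velocities quoted in the legend of Figure 2. The short sketch below only restates those published numbers as percentages of WT-Nsp13. The units are assumed to be µM/s for ATP hydrolysis and nM/s for unwinding, and the dictionary and function names are introduced here for illustration.

```python
# Initial ATP hydrolysis velocities from the Figure 2 legend (assumed uM/s).
atpase_velocity = {
    "WT": 0.3952, "K288A": 0.1926, "S289A": 0.1884, "D374A": 0.1725,
    "E375A": 0.1753, "Q404A": 0.1716, "R567A": 0.1661,
}

# Initial unwinding velocities from the Figure 2 legend (nM/s).
unwinding_velocity = {
    "WT": 1.801, "K288A": 0.037, "S289A": 0.04637, "D374A": 0.03097,
    "E375A": 0.02903, "Q404A": 0.04497, "R567A": 0.0407,
}

def percent_of_wild_type(velocities):
    """Express each mutant's velocity as a percentage of the WT value."""
    wild_type = velocities["WT"]
    return {
        name: 100.0 * value / wild_type
        for name, value in velocities.items()
        if name != "WT"
    }

atpase_percent = percent_of_wild_type(atpase_velocity)
unwinding_percent = percent_of_wild_type(unwinding_velocity)

for mutant in atpase_percent:
    print(f"{mutant}: ATPase {atpase_percent[mutant]:.1f}% of WT, "
          f"unwinding {unwinding_percent[mutant]:.1f}% of WT")
```

On these numbers every mutant retains somewhat less than half of the wild-type ATP hydrolysis velocity but only a few percent of the wild-type unwinding velocity.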

For MERS-Nsp13, Y442 has been proposed to stabilize the
adenosine base of nucleotides (14). In SARS-Nsp13, it is
replaced by arginine. We therefore made the R442A mutant
and demonstrated that it has almost the same or even higher
unwinding ability (Supplementary Figure S3). Thus, R442
does not participate directly in ATP hydrolysis.


Regions critical for double-stranded DNA (dsDNA) binding 


To date, no helicase structure with a nucleic acid
substrate has been solved for CoVs. In order to obtain
detailed structural information about the nucleic acid
binding regions, we docked a dsDNA on SARS-Nsp13 based on
the EAV-Nsp10-RNA structure (4N0O) (13) and the yeast
Upf1-RNA structure (2XZL) (24) (Figure 3A). It is evident
that the radius of the channel formed by the 1B, 1A and 2A
domains is not wide enough for dsDNA to pass through.

According to this modelled complex, residues 176–186 (1B
domain), 209–214 (1B domain), 330–350 (1A domain) and
516–541 (2A domain) are the most probable nucleic acid
binding regions (Figure 3A). We incubated SARS-Nsp13 with a
3-fold molar excess of a dsDNA selected from the SARS-CoV
genome for the H/D exchange assay. Four peptides that
emerged from the analysis of the H/D exchange results all
matched the above four regions and showed less H/D
exchange, indicating that they may contain
dsDNA-interacting amino acids (Figure 3B–E).

To further test whether these regions affect the nucleic
acid binding affinity, we constructed six double mutants,
N179A/R212A, R337A/R339A, R507A/K508A, K524A/Q531A,
K345A/K347A and S539A/Y541A, and measured their binding to
dsDNA using EMSA (Figure 4A and B). All the mutants
exhibited binding deficiency except for mutant R507A/K508A,
which showed almost the same binding affinity as WT
SARS-Nsp13 (WT-Nsp13) (Figure 4C). We suppose that this is
due to the particular location of the two residues, R507
and K508. Both are situated on the surface of the 2A domain
and exposed to solvent. When these two hydrophilic residues
are mutated to hydrophobic alanines, the overall structure
of SARS-Nsp13 may be destabilized. This instability is
supported by the DSF experiment (Supplementary Figure S5).

Furthermore, the helicase activity of all six mutants was
impaired to different degrees. Among them, the mutant
K345A/K347A exhibited the lowest activity; the two
residues, K345 and K347, belong to the β19–β20 loop on
domain 1A.

To discriminate whether the ten residues affected the
helicase activity indirectly, through involvement in the
binding process, or directly, through involvement in the
unwinding process, we incubated SARS-Nsp13 with dsDNA and
ssDNA separately for a further H/D exchange assay.
Intriguingly, amongst the four nucleic acid binding related
peptides, only peptide 331–357, which forms the β19–β20
loop where residues R337, R339, K345 and K347 are situated,
displayed no pattern shift when incubated with ssDNA. This
suggests that the β19–β20 loop on the 1A domain plays a
critical role in the unwinding process itself, in a yet
unknown way, rather than in binding, and that it does not
necessarily participate in binding ssDNA when SARS-Nsp13
translocates on it (Supplementary Figure S4).

Figure 1. Overall structure of SARS-Nsp13. (A) Ribbon representation of SARS-Nsp13, composed of the ZBD (lime), stalk (yelloworange), 1B (salmon), 1A (aquamarine) and 2A (palecyan) domains. Three zinc atoms are shown as dark red spheres. A schematic diagram of the domain organization of SARS-Nsp13 is also shown. (B) Top, the crystal packing arrangement of two SARS-Nsp13 molecules. Bottom, the crystal packing arrangement of two MERS-Nsp13 molecules (5WWP). (C) The ATPase activity of SARS-Nsp13. The final concentration of SARS-Nsp13 was 25 nM. The concentrations of the substrate ATP were 0.08, 0.1, 0.25, 0.5, 1.0, 1.25 and 2.0 mM. The calculated V0 corresponding to each ATP concentration was plotted against the ATP concentration and fitted with the Michaelis-Menten function. The final Vmax is 0.4845 ± 0.01311 µM/min. Km = 0.1552 ± 0.01693 mM. (D) The unwinding activity of SARS-Nsp13. Top, 20 nM SARS-Nsp13 was incubated with dsDNA for 1, 5 and 10 min. Bottom, SARS-Nsp13 at different concentrations was incubated with dsDNA for 1 min. The dsDNA (5′-CAGACATTTTAGAGG-3′-CY3, 5′-AATGTCTGACGTAAAGCCTCTAAAATGTCT-3′) used in the assay is labelled with CY3.


How can the 1A and 2A domains function on ssDNA along 
with ATP hydrolysis? 


The nucleic acid binding channel and the nucleotide binding
pocket have been verified through structural alignments and
biochemical assays. To further understand how the 1B, 1A
and 2A domains are involved in nucleic acid binding along
with ATP hydrolysis, we artificially created four
conditions, each representing a single structural state in
an ATP hydrolysis cycle: Nsp13-dsDNA alone, and Nsp13-dsDNA
incubated with an ATP analog (AMPPNP), ADP-AlF4− or ADP. We
performed the H/D exchange assay to check the shift
patterns of five samples (the four conditions plus
ligand-free SARS-Nsp13). The relative shift D values of all
the peptides with changed patterns are listed in
Supplementary Table S1.

The four conditions of SARS-Nsp13 with substrates are
illustrated with different colors, where red regions
represent less H/D exchange (tightening) and blue regions
represent more H/D exchange (loosening) compared to the
previous condition in the ATP hydrolysis cycle. From this
analysis, we can visualize the tightening or loosening of
specific regions relative to the previous state, gaining
insights into the dynamic changes that are occurring
(Figure 5).
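
The colouring rule described above amounts to comparing each state with the one before it in the cycle. A minimal sketch of that rule follows. The uptake values are hypothetical placeholders for a single peptide, chosen only to exercise the three possible outcomes, and the state labels and function name are introduced here. The measured shifts themselves are those in Supplementary Table S1.

```python
# Order of the states in the ATP hydrolysis cycle described in the text.
STATES = ["dsDNA", "AMPPNP", "ADP-AlF4", "ADP"]

def classify_against_previous(uptake_by_state, tolerance=0.0):
    """Label each state by its change in deuterium uptake versus the previous state."""
    labels = {}
    for previous, current in zip(STATES, STATES[1:]):
        delta = uptake_by_state[current] - uptake_by_state[previous]
        if delta < -tolerance:
            labels[current] = "tightening (red)"   # less H/D exchange
        elif delta > tolerance:
            labels[current] = "loosening (blue)"   # more H/D exchange
        else:
            labels[current] = "unchanged"
    return labels

# HYPOTHETICAL deuterium uptake (Da) of one peptide in each state.
example_uptake = {"dsDNA": 3.0, "AMPPNP": 2.4, "ADP-AlF4": 2.8, "ADP": 2.8}
example_labels = classify_against_previous(example_uptake)

for state, label in example_labels.items():
    print(f"{state}: {label}")
```

A non-zero tolerance would be a natural way to ignore shifts that are smaller than the experimental uncertainty.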

We set SARS-Nsp13-dsDNA as the initial state in the cycle,
with the whole structure shown in grey (Figure 5A). When
the ATP analog enters the active pocket, the 1A and 2A
domains are pulled together. This is based on the reduced
H/D exchange in the region 427–437 (Supplementary Table
S1). The region 427–437 (shown in red in Supplementary
Figure S6) is located between the 1A and 2A domains. When
we added AMPPNP to the incubation system, there was less
H/D exchange in this area, which indicated

Figure 2. The active pocket composed of ATPase related residues. (A) Left, superposition of yeast Upf1-ADP-AlF4− (2XZL) (24) in grey and SARS-Nsp13 in green. Right, stick model of all the ATPase related residues. The ADP-AlF4− is from the Upf1-ADP-AlF4− complex structure, not the SARS-Nsp13 structure. AlF4− is presented in cyan and the ADP molecule in salmon. All the residues are colored by element, with S atoms in orange, O atoms in red, N atoms in blue and H atoms in tints. (B) ATPase activity of all the ATP hydrolysis related mutants. The initial ATP concentration is 150 µM and the protein concentration is 25 nM. The change in the percentage of hydrolyzed ATP over time is shown. The fitting function is one-phase association in the GraphPad Prism program. (C) Initial ATP hydrolysis velocities of WT-Nsp13 and six mutants at 150 µM ATP are 0.3952 ± 0.05841 µM/s (WT-Nsp13), 0.1926 ± 0.01509 µM/s (K288A), 0.1884 ± 0.01409 µM/s (S289A), 0.1725 ± 0.00748 µM/s (D374A), 0.1753 ± 0.0072 µM/s (E375A), 0.1716 ± 0.00947 µM/s (Q404A) and 0.1661 ± 0.005 µM/s (R567A), respectively. (D) Time course of the unwound dsDNA fraction for WT-Nsp13. The initial dsDNA concentration is 250 nM and the protein concentration is 20 nM. The fitting function is one-phase association in the GraphPad Prism program. (E) Time course of the unwound dsDNA fraction for the three mutants K288A, S289A and D374A. The initial dsDNA concentration is 250 nM and the protein concentration is 20 nM. The fitting function is one-phase association in the GraphPad Prism program. (F) Time course of the unwound dsDNA fraction for the three mutants E375A, Q404A and R567A. The initial dsDNA concentration is 250 nM and the protein concentration is 20 nM. The fitting function is one-phase association in the GraphPad Prism program. (G) Initial unwinding velocities of WT-Nsp13 and six mutants at a dsDNA substrate concentration of 250 nM are 1.801 ± 0.2308 nM/s (WT-Nsp13), 0.037 ± 0.001212 nM/s (K288A), 0.04637 ± 0.008041 nM/s (S289A), 0.03097 ± 0.0049 nM/s (D374A), 0.02903 ± 0.007 nM/s (E375A), 0.04497 ± 0.00208 nM/s (Q404A) and 0.0407 ± 0.006129 nM/s (R567A), respectively.


that the structure surrounding the region 427–437 became
more tightened and less exposed to solvent. As a result, we
inferred that the 1A domain and 2A domain might be pulled
together when the AMPPNP molecule entered the active
pocket. The ATP-bound substrate condition represents the
completed state, meaning that the conformational change
caused by ATP binding has already finished and the enzyme
is ready for the next step. The nucleic acid binding
related peptides 511–542 and 496–511 showed less H/D
exchange, indicating that more residues are involved in
nucleic acid binding within these two regions. In contrast
to the 2A domain, the 1A domain adopts a more relaxed
structure, with the two peptides placed within the nucleic
acid binding channel, 318–330 and 358–369, having more H/D

Figure 3. Nucleic acid binding regions. Panels B, C, D and E show the results of the H/D exchange experiments used to identify the nucleic acid binding regions. There are four shift patterns for each peptide in different samples: the first and second rows show the shift patterns of SARS-Nsp13 incubated with a 7-fold and a 3-fold molar excess of dsDNA, respectively, the third row shows the shift pattern of SARS-Nsp13 alone and the last row shows the unexchanged pattern of SARS-Nsp13. The x-axis displays the mass-to-charge ratio of each peptide. The dashed vertical lines indicate the mass-to-charge ratio for each peptide in the different samples. When the mass-to-charge ratio of SARS-Nsp13 incubated with nucleic acids shifts to the right compared to that of SARS-Nsp13 alone, more H/D exchange has occurred in the peptide, and vice versa. (A) Regions in red are the predicted nucleic acid binding related peptides based on the complex model, where the dsDNA is highlighted in blue. (B) Shift patterns for peptide 153–179. (C) Shift patterns for peptide 209–224. (D) Shift patterns for peptide 331–357. (E) Shift patterns for peptide 523–542.
Figure 4. Unwinding activity of double mutants relevant for nucleic acid binding. (A) The rectangle indicates the location of amino acid residues involved in nucleic acid binding and the black arrow points to the β19–β20 loop. Residues in salmon belong to the 1B domain. Residues in cyan belong to the 1A domain. Residues in palecyan belong to the 2A domain. All residues are colored by element, with S atoms in orange, O atoms in red, N atoms in blue and H atoms in tints. (B) The binding affinity for dsDNA of SARS-Nsp13 and six double mutants as shown by EMSA. The protein concentration is 5 µM and the dsDNA concentration is 50 nM. Protein was incubated with dsDNA for 30 min at room temperature. (C) Time course of the unwound dsDNA fraction for WT-Nsp13 and six double mutants. The initial dsDNA concentration is 400 nM and the protein concentration is 20 nM. The fitting function is one-phase association in the GraphPad Prism program. (D) Initial unwinding velocities of WT-Nsp13 and six double mutants at a dsDNA concentration of 400 nM are 1.657 ± 0.6578 nM/s (WT-Nsp13), 0.9663 ± 0.3265 nM/s (N179A/R212A), 0.3105 ± 0.1036 nM/s (R337A/R339A), 0.1184 ± 0.04126 nM/s (K345A/K347A), 0.3797 ± 0.06969 nM/s (R507A/K508A), 0.04992 ± 0.2609 nM/s (K524A/Q531A) and 0.7764 ± 0.4421 nM/s (S539A/Y541A), respectively.


exchanges. The 1A domain is now loosening its grasp on the
nucleic acid compared to the 2A domain (Figure 5B).

The SARS-Nsp13-dsDNA sample with ADP-AlF4− represents a
transition state of ATP hydrolysis, where AlF4− mimics the
gamma-phosphate in the transition state. The increased H/D
exchange in peptide 427–437 (Supplementary Table S1)
indicates that the 1A and 2A domains tend to move away from
each other. Peptides 318–330 and 358–369 of the 1A domain
turn from a relaxed condition in the substrate state (blue)
to a more tightened condition in the transition state (red)
when ATP hydrolysis occurs. Taking these two observations
into consideration, we can reasonably deduce that the 1A
domain slides along the nucleic acid away from the 2A
domain (Figure 5C).

Next, data were collected for SARS-Nsp13-dsDNA bound with
ADP (product state), which exhibited the largest
conformational change, since the shift D values in all
related regions changed the most. Compared to the
transition state, the 1A and 2A domains continued to move
away from each other, according to the increased H/D
exchange in peptide 427–437. The 1A domain reaches its most
tightened condition, while the two nucleic acid binding
related regions on the 2A domain, 496–511 and 511–542,
showed different H/D exchange patterns: peptide 496–511
exchanged more deuterium and peptide 511–542 exchanged
less. This suggested that the nucleic acid binding region
moved towards the 1A domain at the end of an ATP hydrolysis
cycle (Figure 5D).


The ZBD and stalk domains are critical for the helicase ac- 
tivity of SARS-Nsp13 


The 1B, 1A and 2A domains have all been demonstrated to be
involved in the dsDNA unwinding process directly or
indirectly. Nevertheless, the role of the ZBD and stalk
domains remained unclear.

Among the 100 amino acids folding into the ZBD domain,
there are 13 cysteines and 3 histidines that play vital
roles. The first canonical zinc finger (ZnF1) is formed

Figure 5. Different states of Nsp13 with different small molecules as indicated by the H/D exchange results (Supplementary Table S1). A, B, C and D represent four different conformational states of SARS-Nsp13 with or without small molecules, and they form a cycle in which ATP is hydrolysed step by step. Regions in red experienced less H/D exchange, while regions in blue experienced more, compared to the previous state in the cycle. (A) The initial state where Nsp13 is bound with dsDNA. (B) The substrate state where Nsp13-dsDNA is bound with the ATP analog (AMPPNP). (C) The transition state where Nsp13-dsDNA is bound with ADP-AlF4−. (D) The product state where Nsp13-dsDNA is bound with ADP.


by Cys5, Cys8, Cys26 and Cys29. The second zinc finger
(ZnF2) is made up of Cys16, Cys19, His33 and His39.
Lastly, Cys50, Cys55, Cys72 and His75 make up the third
zinc finger (ZnF3). While ZnF1 and ZnF2 are located at
the interface between the ZBD and 1A domains, ZnF3 is
positioned away from it and does not interact with other
regions of the protein (Figure 6A).

The PDB coordinates of the ZBD alone were uploaded to the
DALI server (25) for structural alignment. According to the
result, aside from MERS-Nsp13, the ZBD domain of
SARS-Nsp13 most resembles the CH domain of Upf1
(hUpf1 PDB code: 2WJY, DALI Z-score: 7.9).

One of the structural characteristics that distinguishes
SARS-Nsp13 from Upf1 is that the CH domain of Upf1 (the
counterpart of the ZBD domain) connects with the 1B
domain through a rather flexible region consisting of a long
α helix and a disordered loop, which structurally enables
Upf2 to pull it away from blocking the nucleic acid binding
channel (Figure 6B) (12). In contrast, the ZBD domain of
SARS-Nsp13 is almost inflexible due to the much shorter
loop connecting the ZBD domain and the stalk domain.
The way the ZBD domain packs against the stalk domain
can also account for its fixed state.


The stalk domain, composed of three tightly interacting
α helices (α2, α3, α4) (Supplementary Figure S2), connects
the ZBD domain with the 1B domain. It also forms a small
two-sided interface for the ZBD domain (α2-α4) and for the
two helicase core domains, mainly the 1A domain (α3-α4),
through hydrophobic as well as hydrophilic interactions.
Although it lacks flexibility, this spatial arrangement of
the stalk domain enables the ZBD domain to regulate the
unwinding activity of SARS-Nsp13.

We highlighted all the critical residues involved in
hydrophobic interactions with the stalk domain. Six of these
residues, V6, L7, I20, F106, L130 and A140, cluster
together on the α2-α4 interface where the ZBD domain
contacts the stalk domain. The packing of the stalk domain
against the 1A domain can be attributed to the following
residues situated on the α3-α4 interface: 114W,
117A, 120Y, 121I, 130L, 138L, 234P, 235L, 238P, 382Y,
411L, 412L, 417L and 421Y (Figure 6C).

The two residues N102 and K131, involved in hydrophilic
interactions between domains, were replaced by alanine in
separate experiments in order to assess their influence on
the helicase activity. The interaction between K131 and
S424, situated on the α3 helix and the 1A domain
respectively, provides structural stability for the stalk
region. N102, marking the end of the ZBD domain and the
beginning of the
Figure 6. The ZBD domain plays a critical role during the SARS-Nsp13 helicase activity cycle. (A) Key residues participating in zinc finger formation. 
The ZF3 motif is highlighted in red. The three zinc atoms are shown as red spheres. (B) How the CH domain of Upf1 is rotated away through interaction with 
Upf2 (18). The CH domain is highlighted in green and Upf2 in salmon. The left structure represents Upf1 alone, while the right structure 
represents the Upf1-Upf2 complex. (C) Hydrophobic residues involved in the packing of the stalk region against the ZBD and 1A domains, respectively. Residues in 
green belong to the ZBD domain, residues in yellow-orange to the stalk domain, residues in salmon to the 1B domain and residues in cyan 
to the 1A domain. All residues are colored by element, with S atoms in orange, O atoms in red, N atoms in blue and H atoms in tints. 
(D) The two residues N102 and K131 involved in hydrophilic interactions in the stalk domain. Residues in green belong to the ZBD domain and residues in 
cyan to the 1A domain. The stalk domain is shown in yellow-orange. All residues are colored by element, with S atoms in orange, O 
atoms in red, N atoms in blue and H atoms in tints. (E) Time course of the unwound dsDNA fraction for mutants N102A and K131A. The initial 
dsDNA concentration is 400 nM and the protein concentration is 20 nM. The fitting function is one-phase association in the GraphPad Prism program. 
(F) Initial unwinding velocities of WT-Nsp13 and the two single mutants at a dsDNA concentration of 400 nM: 1.643 ± 0.1667 nM/s (WT-Nsp13), 
0.5119 ± 0.06516 nM/s (N102A) and 0.3164 ± 0.05154 nM/s (K131A). 


stalk domain, interacts with T127, also a residue of the
α3 helix. Both mutants showed deficient helicase function,
but to different degrees (Figure 6D).
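
To put the two mutants on a common scale, the sketch below expresses the initial velocities quoted in the Figure 6 caption relative to wild type. It also writes out the standard one-phase association curve, Y = Y0 + (Plateau - Y0)(1 - exp(-K t)), and its slope at t = 0. How the initial velocities were actually derived from the fits is not stated in this section, so taking the t = 0 slope scaled by the dsDNA concentration is an assumption, and the function names are illustrative only.

```python
import math

def one_phase_association(t, plateau, k, y0=0.0):
    """Unwound fraction at time t (s) for a one-phase association curve."""
    return y0 + (plateau - y0) * (1.0 - math.exp(-k * t))

def initial_velocity(plateau, k, substrate_nM, y0=0.0):
    """Slope of the curve at t = 0, converted from fraction/s to nM/s.

    Assumed definition; the source does not say how velocities were computed.
    """
    return k * (plateau - y0) * substrate_nM

# Initial unwinding velocities quoted in the Figure 6 caption (nM/s).
velocities = {"WT": 1.643, "N102A": 0.5119, "K131A": 0.3164}

# Activity of each construct relative to wild type.
relative = {name: v / velocities["WT"] for name, v in velocities.items()}

for name, ratio in relative.items():
    print(f"{name}: {ratio:.2f} of WT velocity")
```

On these numbers N102A retains roughly 31% and K131A roughly 19% of the wild-type initial velocity, which matches the statement that both mutants are deficient but to different degrees.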

Based on this analysis, it can be concluded that SARS-Nsp13
contains a top-to-bottom signal transfer system, initiated
by the ZBD domain, which can relay the 'signal' to the
helicase core domains.


SARS-Nsp12 can regulate the unwinding process of SARS- 
Nsp13 through direct interaction 


Because both are important components of the replication
and transcription complex (RTC) of SARS-CoV, there should
be functional relationships between SARS-Nsp12 (RdRp) and
SARS-Nsp13. We performed the helicase assay of SARS-Nsp13
alone and with SARS-Nsp12, the results of which revealed
that SARS-Nsp12 can enhance the unwinding activity of
SARS-Nsp13 (Figure 7A and B). The ATPase activity of
SARS-Nsp13 alone and with SARS-Nsp12 was also tested
and compared (Figure 7C and D). The results demonstrate
that SARS-Nsp12 can also enhance the ATPase activity of
SARS-Nsp13.
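
The size of the enhancement can be read off the velocities quoted in the Figure 7 caption. The short sketch below computes the fold change relative to SARS-Nsp13 alone; it uses only the point estimates from the caption and ignores the reported errors, and the dictionary keys and the `fold_change` helper are illustrative names, not part of the original analysis.

```python
# Velocities quoted in the Figure 7 caption (point estimates only).
unwinding_nM_per_s = {
    "Nsp13": 2.078,
    "Nsp13 + SARS-Nsp12": 3.199,
    "Nsp13 + MERS-Nsp12": 2.905,
}
atpase_uM_per_s = {
    "Nsp13": 0.2401,
    "Nsp13 + SARS-Nsp12": 0.3572,
    "Nsp13 + MERS-Nsp12": 0.3373,
}

def fold_change(rates, baseline="Nsp13"):
    """Return each rate divided by the rate of the baseline condition."""
    base = rates[baseline]
    return {name: rate / base for name, rate in rates.items() if name != baseline}

unwinding_fold = fold_change(unwinding_nM_per_s)
atpase_fold = fold_change(atpase_uM_per_s)

for label, folds in (("unwinding", unwinding_fold), ("ATPase", atpase_fold)):
    for name, value in folds.items():
        print(f"{label:9s} {name}: {value:.2f}-fold")
```

Both activities rise by roughly 1.4- to 1.5-fold in the presence of either Nsp12, so the enhancement is modest but similar in size for the two assays.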

To test whether SARS-Nsp12 regulates the unwinding
process of SARS-Nsp13 through direct interaction with
SARS-Nsp13, an SPR assay was conducted. The 0.236 μM
Figure 7. The interaction between SARS-Nsp13 and SARS-Nsp12. (A) The unwinding activity of SARS-Nsp13 incubated with SARS-Nsp12 and MERS-Nsp12 
is presented as the time course of the unwound dsDNA fraction. The initial dsDNA concentration is 250 nM and the protein concentration 
is 20 nM. The fitting function is one-phase association in the GraphPad Prism program. (B) Initial unwinding velocities of SARS-Nsp13, SARS-Nsp13 
with SARS-Nsp12 and SARS-Nsp13 with MERS-Nsp12 at a dsDNA concentration of 400 nM: 2.078 ± 0.4675 nM/s (SARS-Nsp13), 3.199 ± 
0.4153 nM/s (SARS-Nsp13 incubated with SARS-Nsp12) and 2.905 ± 0.3589 nM/s (SARS-Nsp13 incubated with MERS-Nsp12). (C) ATPase 
activity of SARS-Nsp13, SARS-Nsp13 with SARS-Nsp12 and SARS-Nsp13 with MERS-Nsp12. The initial ATP concentration is 150 μM and the protein 
concentration is 25 nM. The change in the percentage of hydrolyzed ATP over time is shown. The fitting function is one-phase association in the 
GraphPad Prism program. (D) ATP hydrolysis velocities of SARS-Nsp13 and SARS-Nsp13 incubated with SARS-Nsp12 or MERS-Nsp12 at an ATP concentration of 150 μM: 
0.2401 ± 0.01062 μM/s (SARS-Nsp13), 0.3572 ± 0.0329 μM/s (SARS-Nsp13 incubated with SARS-Nsp12) and 0.3373 ± 0.0314 
μM/s (SARS-Nsp13 incubated with MERS-Nsp12). (E) Representative SPR sensorgrams for SARS-Nsp12 with 3.12 μM SARS-Nsp13 (blue), 
1.56 μM SARS-Nsp13 (orange), 0.78 μM SARS-Nsp13 (pink) and 0.39 μM SARS-Nsp13 (green). Association time was 60 s and dissociation time was 
60 s. The binding affinity is KD = 236 nM. (F) SARS-Nsp12 binding regions on the ZF3 motif of the ZBD domain and on the 1A domain are highlighted in red. 


KD value calculated for SARS-Nsp12 and SARS-Nsp13
without any ligand demonstrated that the two proteins
interact directly, suggesting that they function
cooperatively in vivo (Figure 7E).
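
For a sense of scale, the sketch below estimates how much of the immobilized partner would be occupied at equilibrium at each SARS-Nsp13 concentration listed for the SPR sensorgrams, given the reported dissociation constant of 0.236 μM. It assumes a simple 1:1 binding model, which this section does not state was the model used for the fit, and it describes equilibrium occupancy, which a 60 s association phase may not reach. The function name is illustrative.

```python
KD_uM = 0.236  # dissociation constant reported for SARS-Nsp12 / SARS-Nsp13

def fraction_bound(conc_uM, kd_uM=KD_uM):
    """Equilibrium fractional occupancy under an assumed 1:1 binding model."""
    return conc_uM / (conc_uM + kd_uM)

# SARS-Nsp13 concentrations listed for the SPR sensorgrams (uM).
analyte_uM = [3.12, 1.56, 0.78, 0.39]

occupancy = {c: fraction_bound(c) for c in analyte_uM}

for conc, theta in occupancy.items():
    print(f"{conc:5.2f} uM SARS-Nsp13 -> {theta:.0%} occupancy at equilibrium")
```

Under these assumptions every concentration in the series lies above the KD, so the expected equilibrium occupancy ranges from about 62% at 0.39 μM to about 93% at 3.12 μM.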

It was therefore necessary to identify the regions of
SARS-Nsp13 that bind SARS-Nsp12. After comparing the
H/D exchange patterns of SARS-Nsp13 alone with those of
SARS-Nsp13 bound to SARS-Nsp12 (Supplementary Figure S7),
we identified four peptides to be responsible: peptides
44-56, 71-92, 262-294 and 317-330, located on the ZF3 motif
of the ZBD domain and on the 1A domain respectively
(Figure 7F).

Interestingly, MERS-Nsp12 can also increase the helicase
activity of SARS-Nsp13 (Figure 7A and B), suggesting that
the interaction between SARS-Nsp13 and SARS-Nsp12
is conserved across CoVs. We performed a sequence
alignment between SARS-Nsp13 and MERS-Nsp13 using
ClustalW2 and ESPript 3.0; the conserved residues are
highlighted in the alignment result (Supplementary Figure
S2) (26,27).


DISCUSSION 


Here we have solved the first structure of SARS-Nsp13 and 
have studied its molecular features that are involved in reg- 
ulating the unwinding process. 


How does the β19-β20 loop participate in the unwinding
process?


We characterized the structure and function of the NTPase
active site at the base between the 1A and 2A domains and
of the nucleic acid binding channel formed by the 1B, 1A
and 2A domains. Among the residues characterized for
substrate binding, four belong to the same β19-β20 loop on
the 1A domain. Results of the H/D exchange assay showed
that the β19-β20 loop changed its shift pattern only when
incubated with dsDNA, not with ssDNA. We thus deduced
that the β19-β20 loop participates in some way in the
unwinding process rather than in binding. Previous studies
showed that virus growth in cells can be attenuated by
introducing the mutation A335V into MHV-Nsp13 (Mouse
Hepatitis Virus Nsp13), and that the viral titer in the liver
of infected mice was reduced by 30-fold (28). The
counterpart of A335 in MHV is A336 in SARS-Nsp13
according to the alignment, and it is also conserved in
MERS-Nsp13, HCoV-229E (Human coronavirus 229E) and
PHEV (Porcine hemagglutinating encephalomyelitis virus)
(Supplementary Figure S2). The four residues critical for
helicase activity, R337, R339, K345 and K347, are basic
amino acids. Situated at the entrance of the nucleic acid
binding channel, they can attract negatively charged
nucleic acid into the channel. The two mutants R337A/R339A
and K345A/K347A each replace two of these basic residues
with hydrophobic alanines, which hinders the entrance of
nucleic acid into the channel. A335 (A336 in SARS-Nsp13),
in contrast, provides suitable steric hindrance to
destabilize and wedge apart the double-stranded part of
the nucleic acid.


The translocation mechanism for SARS-Nsp13 on single- 
stranded nucleic acid 


We can abstract a simple model for SARS-Nsp13
translocating on single-stranded nucleic acid from the
results of the H/D exchange assays of three states, namely
the substrate state, the transition state and the product
state; for conciseness, the model focuses on the 1A and 2A
domains. During the translocation process of SARS-Nsp13,
the 1A and 2A domains come close to each other when an
ATP molecule enters the pocket. The 2A domain tightens its
grasp on the nucleic acid while the 1A domain loosens its
grip, ready for the next step. At the same time, the 1B
domain (166-182) keeps grasping the nucleic acid, helping
the 1A domain to stabilize it. In the transition state, when
hydrolysis occurs, the two domains begin to move apart, and
the more relaxed 1A domain slides along the nucleic acid
and regrasps it when the 1B domain loosens its grip.
Lastly, in the product state, the cleft between the 1A and
2A domains continues to enlarge while one nucleic acid
binding region (496-511) on the 2A domain loosens and the
other region (511-542) tightens. In addition, the 1A
domain tightens its grasp on the nucleic acid, with the 1B
domain also regrasping it.

The translocation model we abstracted for SARS-Nsp13
is consistent with the translocation mechanism of RecD2,
which is also a member of SF1B (translocating in the 5'-3'
direction) yet shares low sequence identity with
SARS-Nsp13 (29). The difference between the two models is
that the 1B domain in SARS-Nsp13 functions as a whole with
the 1A domain, while the 2B domain in RecD2 functions as a
whole with its 2A domain. This feature is in accordance
with Upf1, since the 1B domain of Upf1 is an inserted
domain in the 1A domain.


The ZBD domain is critical for helicase activity or even the 
life cycle of SARS-CoV 


Functional studies showed that Nsp13 from HCoV-229E,
a coronavirus like SARS-CoV, displayed impaired ATPase
activity as a result of Cys or His residue substitutions
(C5003A, C5021H, C5024A, H5028R) in the ZBD domain (30).
It can be concluded that, unlike the CH domain of Upf1, the
ZBD domain of nidovirus helicases is indispensable for
catalytic activity and that the interplay between the ZBD
domain and the core helicase domains is finely tuned.

In contrast to the flexible CH domain of Upf1, the ZBD 
domain of SARS-Nsp13 interacts with the stalk domain 
tightly and plays a vital role in the helicase activity. Struc- 
tural arrangement of the stalk domain with the 1A and 
ZBD domains provides strong evidence that the stalk do- 
main can be the rigid bridge transferring the influence com- 
ing from the ZBD domain onto the helicase core domains. 

The structural alignment between the CH domain of Upf1
and the ZBD domain of SARS-Nsp13 showed that they are
much alike (Z-score, 7.9). As an essential component of the
nonsense-mediated mRNA decay (NMD) pathway in eukaryotic
cells, Upf1 is crucial in recognizing exogenous nucleic
acid for degradation. SARS-Nsp13, with its similar ZBD
domain, might therefore mimic Upf1 and interact with Upf2
to help SARS-CoV escape the host immune system.


The interaction between SARS-Nsp12 and SARS-Nsp13 is 
critical for life cycle of CoVs 


We have substantiated that all five domains in SARS-Nsp13 
are directly or indirectly involved in helicase activity and 
are finely coordinated with each other due to the well- 
arranged structural assembly. However, the self-equilibrium 
of SARS-Nsp13 will be readjusted when it functions in vivo 
since there will be external force imposed on it. 

In the life cycle of CoVs, Nsp13 as a helicase is likely
to lead the replication and transcription complex (RTC) in
unwinding double-stranded nucleic acid for self-reproduction.
Two facts provide an opportunity for the first visualization
of the RTC of CoVs, both functionally and structurally:
1) both SARS-Nsp12 and MERS-Nsp12 can enhance the helicase
and ATPase activity of SARS-Nsp13; 2) SARS-Nsp12 can
interact with SARS-Nsp13 on the ZF3 motif of the ZBD
domain and the 1A domain.